-
Notifications
You must be signed in to change notification settings - Fork 757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adding generate operation #3573
base: master
Are you sure you want to change the base?
feat: adding generate operation #3573
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3573 +/- ##
==========================================
- Coverage 54.49% 47.63% -6.86%
==========================================
Files 134 236 +102
Lines 12329 19788 +7459
==========================================
+ Hits 6719 9427 +2708
- Misses 5116 9478 +4362
- Partials 494 883 +389
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Biggest concern is the timestamp stuff.
DefaultGenerateVAP = flag.Bool("default-create-vap-for-templates", false, "Create VAP resource for template containing VAP-style CEL source. Allowed values are false: do not create Validating Admission Policy unless generateVAP: true is set on constraint template explicitly, true: create Validating Admission Policy unless generateVAP: false is set on constraint template explicitly.") | ||
log = logf.Log.V(logging.DebugLevel).WithName("controller").WithValues(logging.Process, "constraint_controller") | ||
discoveryErr *apiutil.ErrResourceDiscoveryFailed | ||
DefaultWaitForGeneration = flag.Int("default-wait-for-generation", 30, "Wait to generate ValidatingAdmissionPolicyBinding after the constraint is created. Defaults to 30 seconds.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should not include the hardcoded time in the error text -- prone to bitrot.
return reconcile.Result{}, err | ||
} | ||
currentVapBinding = nil | ||
|
||
if currentVapBinding == nil && instance.GetCreationTimestamp().Add(time.Duration(*DefaultWaitForGeneration)).Before(time.Now()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not rely on the creation timestamp... that leaves us vulnerable to clock skew since that is set by an unknown machine.
Also, as discussed in the OSS mtg, the time delay should be based off of the constraint template, specifically wait time seconds after the CRD is created from the template. There is no need to wait due to the constraint.
if generateVAPB && groupVersion != nil { | ||
currentVapBinding, err := vapBindingForVersion(*groupVersion) | ||
if operations.IsAssigned(operations.Generate) { | ||
err := util.ValidateEnforcementAction(enforcementAction, instance.Object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VAP generation should probably be in its own function -- the reconcile function is unwieldy at this point.
Also, I'm not sure we want to only validate enforcement action when generate is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored the code
return reconcile.Result{}, err | ||
|
||
if operations.IsAssigned(operations.Generate) { | ||
isVapAPIEnabled := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably move generation lifecycle stuff to its own function.
Makefile
Outdated
@@ -70,6 +70,7 @@ MANAGER_IMAGE_PATCH := "apiVersion: apps/v1\ | |||
\n - --exempt-namespace=${GATEKEEPER_NAMESPACE}\ | |||
\n - --operation=webhook\ | |||
\n - --operation=mutation-webhook\ | |||
\n - --operation=generate\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is added to controller-manager instead of audit. was this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was supposed to be added to audit pod. Fixed it.
a0fd55a
to
a0d015e
Compare
cmd/build/helmify/static/README.md
Outdated
@@ -171,6 +171,7 @@ information._ | |||
| enableK8sNativeValidation | Enable the K8s Native Validating driver to allow constraint templates to use rules written in VAP-style CEL (beta feature) | `true` | | |||
| defaultCreateVAPForTemplates | (alpha) Create VAP resource for template containing VAP-style CEL source. Allowed values are false: do not create Validating Admission Policy unless generateVAP: true is set on constraint template explicitly, true: create Validating Admission Policy unless generateVAP: false is set on constraint template explicitly. | `false` | | |||
| defaultCreateVAPBindingForConstraints | (alpha) Create VAPBinding resource for constraint of the template containing VAP-style CEL source. Allowed values are false: do not create Validating Admission Policy Binding, true: create Validating Admission Policy Binding. | `false` | | |||
| defaultWaitForVAPBGeneration | (alpha) Wait to generate ValidatingAdmissionPolicyBinding after the constraint CRD is created. | `30` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| defaultWaitForVAPBGeneration | (alpha) Wait to generate ValidatingAdmissionPolicyBinding after the constraint CRD is created. | `30` | | |
| defaultWaitForVAPBGeneration | (alpha) Wait time in seconds before generating a ValidatingAdmissionPolicyBinding after a constraint CRD is created. | `30` | |
logger = log.Log.V(logging.DebugLevel).WithName("controller").WithValues("kind", "ConstraintTemplate", logging.Process, "constraint_template_controller") | ||
discoveryErr *apiutil.ErrResourceDiscoveryFailed | ||
logger = log.Log.V(logging.DebugLevel).WithName("controller").WithValues("kind", "ConstraintTemplate", logging.Process, "constraint_template_controller") | ||
defaultWaitForVAPBGeneration = flag.Int("default-wait-for-vapb-generation", 30, "(alpha) Wait to generate ValidatingAdmissionPolicyBinding after the constraint CRD is created.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -574,6 +483,132 @@ func (r *ReconcileConstraint) reportErrorOnConstraintStatus(ctx context.Context, | |||
return err | |||
} | |||
|
|||
func (r *ReconcileConstraint) generateVAPB(ctx context.Context, enforcementAction util.EnforcementAction, instance *unstructured.Unstructured, status *constraintstatusv1beta1.ConstraintPodStatus) (time.Duration, error) { | |||
ret := time.Duration(0) | |||
if !operations.IsAssigned(operations.Generate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if none of the controllers has the generate
operation, how do we report this issue to the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added log statement for now. Another option is to figure the same out by looking at all CTStatus resource in CTstatus controller - https://github.com/open-policy-agent/gatekeeper/blob/master/pkg/controller/constrainttemplatestatus/constrainttemplatestatus_controller.go#L185. However we might need one field under CT Status to post the error.
if err != nil { | ||
return err | ||
} | ||
if t.Before(currentTime.Add(time.Duration(*defaultWaitForVAPBGeneration*2) * time.Second)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this * 2
? can you add a comment for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified this to remove *2, added the comment as well.
|
||
// waiting for sometime before generating vapbinding, gives api-server time to cache CRDs | ||
timestamp := ct.Annotations[BlockVAPBGenerationUntilAnnotation] | ||
if timestamp != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if it's not populated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this behavior.
func (r *ReconcileConstraintTemplate) updateTemplateWithBlockVAPBGenerationAnnotations(ctx context.Context, ct *v1beta1.ConstraintTemplate) error { | ||
currentTime := time.Now() | ||
switch { | ||
case ct.Annotations == nil || ct.Annotations[constraint.BlockVAPBGenerationUntilAnnotation] == "": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be a switch statement or can a if
clause work better since some of the code below seems to be repeated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code to if stmt
// currentTime := time.Now() | ||
// if currentTime.Before(blockTime) { | ||
// t.Fatal("VAPBinding should not be created before the timestamp", currentTime, blockTime) | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the commented block. Updated the tests as well.
return err | ||
} | ||
// if wait time is within the time window to generate vap binding, do not update the annotation | ||
if t.Before(currentTime.Add(time.Duration(*constraint.DefaultWaitForVAPBGeneration) * time.Second)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if annotation is already populated, then can we always skip an update? what's the use case for when the annotation needs to be updated with a new time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ensures that VAPB is generated if a faulty actor has set the timestamp in the future manually—beyond the allowed delay configured by the flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the context and it would be useful to add that as a comment.
default: | ||
// reconcile after default wait time for vapb generation if annotation is not set | ||
if ct.Annotations == nil || ct.Annotations[BlockVAPBGenerationUntilAnnotation] == "" { | ||
return time.Duration(*DefaultWaitForVAPBGeneration) * time.Second, r.reportErrorOnConstraintStatus(ctx, status, errors.New("annotation to wait for ValidatingAdmissionPolicyBinding generation not found"), "could not find annotation to wait for ValidatingAdmissionPolicyBinding generation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this returned time duration is ignore when an error is returned. we should be consistent to always rely on the annotation for the timestamp instead of returning a duration here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose here was to requeue after user defined delay in case wanted annotations are not found (wait for annotation to be set by CT constroller if annotation was not present for any reason). I updated the code to reflect the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to have a user-defined delay when annotation is missing, that should be handled by controller code (user sets time delay after CRD gets created, not internal controller mechanics)
Signed-off-by: Jaydip Gabani <[email protected]>
2947f62
to
d7e2964
Compare
Signed-off-by: Jaydip Gabani <[email protected]>
return reconcile.Result{}, r.reportErrorOnConstraintStatus(ctx, status, err, fmt.Sprintf("could not delete ValidatingAdmissionPolicyBinding: %s", vapBindingName)) | ||
} | ||
} | ||
if requeueAfter != time.Duration(0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we are requeueing before cacheConstraint
is called (which adds the constraint to the constraint framework), the constraint will not be enforced at all until the VAP objects are created. These seems unnecessary and will hurt the performance of existing G8r uses.
return reconcile.Result{}, r.reportErrorOnConstraintStatus(ctx, status, err, fmt.Sprintf("could not update ValidatingAdmissionPolicyBinding: %s", vapBindingName)) | ||
} | ||
} | ||
requeueAfter, err := r.generateVAPB(ctx, enforcementAction, instance, status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the VAP generation logic is gated on reflect.DeepEqual()
of the constraint as-cached by the constraint framework.
I don't think that is the right logic gate for VAP-gen logic -- it should be gated on whether the extant VAP objects have drifted from the to-be-generated VAP objects. Otherwise users could modify the generated VAP objects and G8r would not re-align them until the next time the constraint is touched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we gate actual writes on VAP binding having a diff --- could just always call generateVAP() code. Maybe we could avoid unnecessary logic execution and therefore improve performance by refactoring more, but always calling VAP gen should be sufficient.
Bonus points for calling it after adding the constraint to the constraint framework -- that avoids blocking all enforcement on VAP succeeding as mentioned elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to call generateVAP
out side of reflect.DeepEqual()
gate. Also calling generateVAP
after constraint
is marked enforced.
hasVAP, err := ShouldGenerateVAP(unversionedCT) | ||
switch { | ||
case errors.Is(err, celSchema.ErrCodeNotDefined): | ||
generateVAPB = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this error be swallowed? Or should we tell the user that the constraint template does not support VAP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will mean when we switch the default to always generate VAP and VAPB, we will be throwing error for existing users using constraint template with Rego. I can also see the other use-case, where we would want to notify users if the intent is to use VAP but there is no CEL in the CT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we not switch to a model where users must use the VAP enforcement point to use VAP? I remember having that discussion. Part of the reason for requiring explicit user intent was to avoid questions like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I think an error or "warning"-type error makes sense. Users can use scoped EA to disable it.
Contrast that with users getting no notice that a Rego-sourced CT is incompatible with VAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should check if engine is K8sNativeValidation, then log and report the error. I recall this error was swallowed in 3.17.1 because 3.17.0 was reporting this error for existing rego CTs.
default: | ||
// reconcile after default wait time for vapb generation if annotation is not set | ||
if ct.Annotations == nil || ct.Annotations[BlockVAPBGenerationUntilAnnotation] == "" { | ||
return time.Duration(*DefaultWaitForVAPBGeneration) * time.Second, r.reportErrorOnConstraintStatus(ctx, status, errors.New("annotation to wait for ValidatingAdmissionPolicyBinding generation not found"), "could not find annotation to wait for ValidatingAdmissionPolicyBinding generation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to have a user-defined delay when annotation is missing, that should be handled by controller code (user sets time delay after CRD gets created, not internal controller mechanics)
vapBindingName := fmt.Sprintf("gatekeeper-%s", instance.GetName()) | ||
log.Info("check if vapbinding exists", "vapBindingName", vapBindingName) | ||
if err := r.reader.Get(ctx, types.NamespacedName{Name: vapBindingName}, currentVapBinding); err != nil { | ||
if !apierrors.IsNotFound(err) && !errors.As(err, &discoveryErr) && !meta.IsNoMatchError(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discovery/nomatch errors should not be a concern for bindings (which are a built-in type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to remove the error matching.
return reconcile.Result{}, r.reportErrorOnConstraintStatus(ctx, status, err, fmt.Sprintf("could not update ValidatingAdmissionPolicyBinding: %s", vapBindingName)) | ||
} | ||
} | ||
requeueAfter, err := r.generateVAPB(ctx, enforcementAction, instance, status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we gate actual writes on VAP binding having a diff --- could just always call generateVAP() code. Maybe we could avoid unnecessary logic execution and therefore improve performance by refactoring more, but always calling VAP gen should be sufficient.
Bonus points for calling it after adding the constraint to the constraint framework -- that avoids blocking all enforcement on VAP succeeding as mentioned elsewhere.
return err | ||
} | ||
// if wait time is within the time window to generate vap binding, do not update the annotation | ||
// otherwise update the annotation with the current time + wait time. This protects against manual updates on annotations with a timestamp that prevents binding from getting generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, more accurately, this prevents clock skew from preventing generation on task reschedule.
This design really doesn't protect against malicious users (RBAC for templates does that)
return nil | ||
} | ||
|
||
err := r.updateTemplateWithBlockVAPBGenerationAnnotations(ctx, ct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably note that we are purposefully making a second call to the API server here in order to make sure the timestamp is post-CRD creation.
Otherwise the question I have is "why make two requests?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment here.
if ct.Annotations == nil { | ||
ct.Annotations = make(map[string]string) | ||
} | ||
ct.Annotations[constraint.BlockVAPBGenerationUntilAnnotation] = currentTime.Add(time.Duration(*constraint.DefaultWaitForVAPBGeneration) * time.Second).Format(time.RFC3339) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also have an annotation that is non-clock-dependent, that lets us know when generation is always allowed, otherwise we run the risk of sporadic delays when the pod reschedules.
Likely a significant change, probably not a blocker for beta (just a known risk)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the annotations are set after CRD is created, pod reschedules won't cause the annotations to be set again. In the event of pod getting reschedule after CRD is created and before the annotations are set, annotations will get set on the restart and then it won't get updated again.
Since we only update annotations if it is not set or it is in future outside of the window defined by users, I am failing to see the need of non-clock-dependent
annotation. Can you describe an example where we run the risk of sporadic delays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember the discussion about clock skew. Pod gets rescheduled to a node that is a day behind -> sporadic 30-second delays generating bindings, even if the CRD has been extant for hours.
Signed-off-by: Jaydip Gabani <[email protected]>
Signed-off-by: Jaydip Gabani <[email protected]>
default: | ||
// reconcile for vapb generation if annotation is not set | ||
if ct.Annotations == nil || ct.Annotations[BlockVAPBGenerationUntilAnnotation] == "" { | ||
return time.Duration(1) * time.Second, r.reportErrorOnConstraintStatus(ctx, status, errors.New("annotation to wait for ValidatingAdmissionPolicyBinding generation not found"), "could not find annotation to wait for ValidatingAdmissionPolicyBinding generation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per previous comment, this should return a duration of zero... let the controller decide what the retry delay is.
return err | ||
} | ||
if ct.Annotations != nil { | ||
ct.Annotations[constraint.BlockVAPBGenerationUntilAnnotation] = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to reset this annotation when the CRD needs an update? the CRD should already been created and known to the apiserver right? and therefore, no need to wait for the vapb creation/updates?
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, using
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when the PR gets merged):Fixes #3659
Related #3501
Special notes for your reviewer: